-- The MAILING DATE of this communication appears on the cover sheet with the correspondence address --
Period for Reply 

A SHORTENED STATUTORY PERIOD FOR REPLY IS SET TO EXPIRE 3 MONTH(S) FROM 
THE MAILING DATE OF THIS COMMUNICATION. 

- Extensions of time may be available under the provisions of 37 CFR 1.136(a). In no event, however, may a reply be timely filed 
after SIX (6) MONTHS from the mailing date of this communication. 

- If the period for reply specified above is less than thirty (30) days, a reply within the statutory minimum of thirty (30) days will be considered timely.

- If NO period for reply is specified above, the maximum statutory period will apply and will expire SIX (6) MONTHS from the mailing date of this communication.

- Failure to reply within the set or extended period for reply will, by statute, cause the application to become ABANDONED (35 U.S.C. § 133).

- Any reply received by the Office later than three months after the mailing date of this communication, even if timely filed, may reduce any
earned patent term adjustment. See 37 CFR 1.704(b).

Status 

1) [X] Responsive to communication(s) filed on 7/20/2000.
2a) [ ] This action is FINAL. 2b) [X] This action is non-final.

3) [ ] Since this application is in condition for allowance except for formal matters, prosecution as to the merits is
closed in accordance with the practice under Ex parte Quayle, 1935 C.D. 11, 453 O.G. 213.
Disposition of Claims 

4) [ ] Claim(s) is/are pending in the application.

4a) Of the above claim(s) is/are withdrawn from consideration.

5) [ ] Claim(s) is/are allowed.

6) [X] Claim(s) 1-9 is/are rejected.

7) [ ] Claim(s) is/are objected to.

8) [ ] Claim(s) are subject to restriction and/or election requirement.

Application Papers 

9) [X] The specification is objected to by the Examiner.

10) [X] The drawing(s) filed on 20 July 2000 is/are: a) [X] accepted or b) [ ] objected to by the Examiner.

Applicant may not request that any objection to the drawing(s) be held in abeyance. See 37 CFR 1.85(a).
11) [ ] The proposed drawing correction filed on is: a) [ ] approved b) [ ] disapproved by the Examiner.

If approved, corrected drawings are required in reply to this Office action.

12) [X] The oath or declaration is objected to by the Examiner.
Priority under 35 U.S.C. §§ 119 and 120 

13) [X] Acknowledgment is made of a claim for foreign priority under 35 U.S.C. § 119(a)-(d) or (f).

a) [X] All b) [ ] Some* c) [ ] None of:

1. [ ] Certified copies of the priority documents have been received.

2. [ ] Certified copies of the priority documents have been received in Application No. .

3. [X] Copies of the certified copies of the priority documents have been received in this National Stage
application from the International Bureau (PCT Rule 17.2(a)).
* See the attached detailed Office action for a list of the certified copies not received.

14) [ ] Acknowledgment is made of a claim for domestic priority under 35 U.S.C. § 119(e) (to a provisional application).

a) [ ] The translation of the foreign language provisional application has been received.

15) [ ] Acknowledgment is made of a claim for domestic priority under 35 U.S.C. §§ 120 and/or 121.
Attachment(s) 

1) [X] Notice of References Cited (PTO-892) 4) [ ] Interview Summary (PTO-413) Paper No(s). .

2) [ ] Notice of Draftsperson's Patent Drawing Review (PTO-948) 5) [ ] Notice of Informal Patent Application (PTO-152)

3) [X] Information Disclosure Statement(s) (PTO-1449) Paper No(s) 3. 6) [ ] Other:
DETAILED ACTION

Oath/Declaration

1) The oath or declaration is defective. A new oath or declaration in compliance
with 37 CFR 1.67(a) identifying this application by the application number and filing date 
is required. See MPEP §§ 602.01 and 602.02. 

The oath or declaration is missing. Please submit or resubmit another copy. 

Specification 

2) The title on Page 1 of the Description is misspelled; please correct it. The title
currently reads "Scoring of Test Units," and should read, "Scoring of Text Units."

Claim Rejections - 35 USC § 103 

3) The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

3-1) Claims 1, 2, 3, 7 and 9 are rejected under 35 U.S.C. 103(a) as being
unpatentable over Inaba, U.S. Patent No. 6,154,737 (issued Nov. 28, 2000), in view
of Chen, European Patent Application No. EP 0,741,364 A1 (published Jun. 6,
1996).

In regard to independent claim 1, Inaba teaches a word frequency index for
storing a frequency of occurrence of a dictionary word in the target document (Inaba,
column 3, lines 64-67; compare with claim 1 "forming a structure for... strings, in which
structure a string is associated with each pair of text units in which the string occurs;
...summing the number of occurrences of each other text unit in the same structure...").
Specifically, Inaba's teaching shows a structure ("frequency index") which stores the
frequency of occurrences of a word. One skilled in the art would have been motivated to
develop an index/structure that stores strings of words, as claimed in the invention,
based on the teachings of Inaba. By comparing the frequency of a word with a document,
Inaba teaches a comparison of two textual objects (equivalent to a pair of text units), and
stores the comparison data in the frequency index. Words and document text in Inaba's
teachings are deemed text units. Inaba does not specifically teach a structure for the
strings associated with pairs of text units. However, Chen does teach a phrase list that
stores candidate phrases (Chen, column 6, lines 55-60; compare with claim 1
"...structure for each of at least some of said strings"). Phrases (in Chen) are equivalent
to a plurality of words (as claimed).

Additionally, Inaba teaches a frequency score calculating means for calculating 
the frequency of a text word occurrence in a particular document (Inaba, column 4,
lines 52-56; compare with claim 1 "...form an individual score for each pair of text 
units"). Specifically, in Inaba, the text words and the document text form a pair of text 
units that are compared and the number of occurrences of the text words in the 
document form the score for the comparison of that particular text unit and the 
document text. 

Additionally, Inaba teaches a word co-occurrence index for storing word
co-occurrence (Inaba, column 4, lines 17-23, lines 30-34; compare with claim 1 "form a
final score for each pair of text units to determine how many times any string is shared
between each pair ..."). Specifically, a co-occurrence score is the quantity of the word
appearances when compared to the document text. The quantity is the final score for
the comparison of that pair of text units. Inaba also teaches a degree of coincidence
between the document text and the words (Inaba, lines 53-56; compare with claim
language "...form a final score...").

It would have been obvious to one of ordinary skill at the time of the invention to 
make a text index structure, as taught in Inaba in view of Chen, that contains strings 
and the text they are associated with, providing Inaba the benefit of indexing candidate 
phrases (Chen's teachings) rather than just words (Inaba's teachings) and providing the 
benefit of having an organized method of storing strings and the text units they 
associate with. It would also have been obvious to one of ordinary skill in the art at the 
time of the invention to interpret Inaba's system to be used for calculating individual 
scores of each pair of text units that contain a plurality of text, providing the benefit of 
comparing strings that represent a plurality of text units. Furthermore, it would have 
been obvious to one of ordinary skill at the time of the invention to interpret Inaba's 
teaching to be used for forming a final score for each string pair, providing the benefit of 
determining how many times any string is shared between each string pair to form a 
final score of co-occurrence.
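
For illustration only, the pair-scoring language quoted from claim 1 above can be sketched in Python. This is a minimal sketch under stated assumptions and is not the implementation of Inaba, Chen, or the applicant: the function name `score_pairs`, the input format (each text unit given as a list of strings), and the choice to count each shared string once per pair are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def score_pairs(text_units):
    """Count, for every pair of text units, how many distinct strings they share.

    text_units: list of lists of strings (hypothetical input format).
    Returns a dict mapping (i, j), with i < j, to the shared-string count.
    """
    # Structure associating each string with the text units in which it occurs.
    index = defaultdict(set)
    for unit_id, strings in enumerate(text_units):
        for s in strings:
            index[s].add(unit_id)

    # Each string adds one point to every pair of units that share it.
    scores = defaultdict(int)
    for units in index.values():
        for pair in combinations(sorted(units), 2):
            scores[pair] += 1
    return dict(scores)


units = [
    ["text", "unit", "score"],
    ["score", "text", "rank"],
    ["rank", "document"],
]
pair_scores = score_pairs(units)
```

In this toy input, units 0 and 1 share two strings and units 1 and 2 share one; pairs that share nothing do not appear in the result.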

In regard to dependent claim 2, Inaba teaches a ranking means for rearranging 
the target document in the order of score obtained by the text unit comparison (as 
described in claim 1 above) (Inaba, column 4, lines 63-65; compare with claim 2
"...ranking the text units on the basis of individual scores"). 

In regard to dependent claim 3, Inaba fails to teach sentences without stop
words, and stem-index records that correspond to stem words. However, Chen
teaches an automatic method for breaking a document into multi-word phrases free of
stop words (Chen, paragraph (57); Figure 2, items 43-58; column 1, line 45 shows
'sentences'; compare with claim language "...text units are sentences... strings are
words forming said sentences, ...removing stop-words"). It would have been obvious to
one of ordinary skill at the time of the invention to think of Chen's teachings of
multi-word phrases as equivalent to sentences (as claimed).

Additionally, Chen selects as the key phrases the candidate phrases occurring
most frequently (Chen, column 2, lines 34-36; column 5, lines 45-47, figure 2, item 44;
compare with claim language "...stemming each remaining word and indexing the sentences prior to carrying out
said summing step..."). Additionally, Chen teaches a candidate phrase list which is a list
of key phrases, excluding the stop words (Chen, column 6, lines 55-59, column 7, lines
1-4; compare with claim language "stem-index records... comprising stem words and
one or more indexes corresponding to sentences ..."). A candidate phrase list (of
Chen) keeps only relevant words that occur in sentences. It would have been obvious
to one of ordinary skill at the time of the invention to include Chen's teachings with
Inaba to develop a method of removing stop words from sentences, stemming each
remaining word and thereafter creating stem-index structures comprising stem words
and the indexes corresponding to sentences in which stemmed words occur, providing the
benefit of removing needless words and indexing only the relevant strings/sentences
with their key words.
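
For illustration only, the sequence discussed above (removing stop words, stemming each remaining word, and recording which sentences each stem occurs in) can be sketched as follows. This is a minimal sketch under stated assumptions: the stop-word list, the crude suffix stripper `toy_stem`, and the function `build_stem_index` are hypothetical stand-ins and are not taken from Chen, Inaba, or the application.

```python
from collections import defaultdict

# Illustrative stop-word list only; a real system would use a fuller list.
STOP_WORDS = {"the", "a", "of", "and", "is", "are", "in"}


def toy_stem(word):
    """Very crude suffix stripper, standing in for a real stemmer."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_stem_index(sentences):
    """Map each stem to the indexes of the sentences in which it occurs."""
    index = defaultdict(list)
    for i, sentence in enumerate(sentences):
        for word in sentence.lower().split():
            word = word.strip(".,;:!?\"'")
            if not word or word in STOP_WORDS:
                continue  # drop stop words before stemming
            stem = toy_stem(word)
            if i not in index[stem]:
                index[stem].append(i)
    return dict(index)


sentences = [
    "The index stores the scores of sentences.",
    "Scoring sentences is the goal.",
]
stem_index = build_stem_index(sentences)
```

Here "scores" and "Scoring" reduce to the same stem, so that stem's record lists both sentences, while stop words such as "the" never enter the index.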

In regard to dependent claim 7, Inaba fails to teach the limitations of calculating 
a level (the highest score in relation to a threshold value) for each text unit and fails to 
teach a final score. However, Chen teaches a processor that compares the number of 
occurrences of a word within the document to a threshold and excludes those candidate 
terms that fall below the threshold (Chen, column 6, lines 4-10; compare with claim 
language "calculating a level for each text unit in addition to final score ... level indicates 
the value of the highest of said individual scores in relation to a threshold value."). 
Specifically, to determine if a term occurs more than the threshold, the Chen processor 
must maintain a counter that increments whenever an occurrence is found common in 
comparing text units. It would have been obvious to one of ordinary skill in the art at 
the time of the invention, to include Chen's teachings with Inaba, to develop a method 
for calculating a score, which is the number of occurrences of a word, and thereafter 
calculating the highest of individual scores in relation to a threshold value, providing the 
user the benefit of being able to generate the final scores of string comparisons and 
care only about those scores that are above a certain level. 
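
For illustration only, one possible reading of the quoted claim 7 language (a level reflecting the highest individual score of a text unit in relation to a threshold) can be sketched as follows. This is a minimal sketch under stated assumptions: the function `levels`, the pair-score input format, and the interpretation of "in relation to a threshold" as a simple greater-than-or-equal comparison are hypothetical and are not drawn from Chen, Inaba, or the application.

```python
def levels(pair_scores, threshold):
    """For each text unit, return (highest individual score, meets threshold?).

    pair_scores: dict mapping (unit_a, unit_b) to an individual score
    (hypothetical format).
    """
    highest = {}
    for (a, b), score in pair_scores.items():
        # A pair's score counts toward both units in the pair.
        for unit in (a, b):
            highest[unit] = max(highest.get(unit, 0), score)
    return {unit: (best, best >= threshold) for unit, best in highest.items()}


pair_scores = {(0, 1): 2, (1, 2): 1, (2, 3): 1}
unit_levels = levels(pair_scores, threshold=2)
```

With a threshold of 2, units 0 and 1 reach the level because their best pair score is 2, while units 2 and 3 do not.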

In regard to dependent claim 9, Inaba teaches a ranking means for rearranging 
the target document in the order of score obtained by the text unit comparison and 
displays it to the user (as described in claim 1 above) (Inaba, column 4, lines 63-65; 
compare with claim 9 "a system for ranking text units in a text, the system comprising a 
data processor"). It would have been obvious to one of ordinary skill to implement the
system as taught in Inaba to include a data processor programmed to perform the
textual scoring/ranking operations, because data processors were a very common means
of performing operations such as those described in the claim. A data processor provides
the benefit of faster processing than performing the operations without one.

3-2) Claims 4 and 8 are rejected under 35 U.S.C. 103(a) as being unpatentable
over Inaba, in view of Chen, as applied to claim 1 above, and further in view of
Liddy, U.S. Patent No. 5,873,056 (issued Feb. 16, 1999).

In regard to dependent claim 4, Inaba fails to teach words associated with 
subject codes. However, Liddy teaches a system that generates a subject vector 
representation of the text which may be a sentence, and uses subject codes that are 
obtained from a lexical database and assigned from the database (Liddy, Abstract 
section; compare with claim language "word being associated with one or more subject 
codes representing subjects with which said word is associated, and wherein said 
strings are subject codes associated with said words"). It would have been obvious to 
one of ordinary skill in the art at the time of the invention to include Liddy's teachings 
with Inaba to associate subject codes with words and strings, providing the benefit of 
categorizing strings and words into subject areas for more efficient searching of the 
strings and words. 

In regard to dependent claim 8, Inaba does not expressly teach a storage 
medium. However, Liddy teaches a natural language processing system with a lexical 
database (Liddy, Abstract section; compare with claim language "a storage medium 
contain a program for controlling a programmable data process to perform a 
method..."). The Liddy system teaches a language processing program that controls
text from documents, maintains a lexical data storage and generates subject codes. 

It would have been obvious to one of ordinary skill in the art at the time of the 
invention to interpret Liddy's lexical database system in view of Inaba to develop a text 
scoring system with a database coupled with a database management system, because 
at the time of the invention, databases with management systems were well known in 
the industry as a storage medium with a programmable data process to perform a 
method (for example, Oracle, Sybase,...). Furthermore, in Liddy, the qualification 
'lexical' shows that the database is not a general purpose storage medium, rather, it is a 
storage medium for lexical data (similar to the claimed invention). Inherently, that 
shows that the database contains a programmable data processor for analyzing the 
data to be stored in the lexical database. This system would have provided the benefit 
of having a programmable data store.

3-3) Claims 5 and 6 are rejected under 35 U.S.C. 103(a) as being unpatentable
over Inaba, in view of Chen, as applied to claim 1 above, and in view of
Liddy, as applied to claim 4 above, and further in view of Baker, U.S. Patent
No. 5,680,511 (issued Oct. 21, 1997).

In regard to dependent claim 5, Inaba fails to teach a system that breaks down 
words into smaller components and disregards strings if the same word spelling is 
associated with the same subject code in a pair of string pairs. However, Baker teaches 
an ambiguity recognition system that recognizes a sequence of words in a document 
before breaking the words into smaller components and analyzing each word 
individually (Baker, Abstract; column 1, lines 50-57; compare with claim language "word
spelling associated with each occurrence of a subject code in a text unit... occurrences 
of the same subject code in a pair of text units are disregarded if the same word spelling 
is associated with said same subject code in said pair of text units"). The examiner
interprets the claim as trying to achieve a method to reduce duplication in text units
by breaking words down into smaller units in order to analyze portions (i.e., analysis of
letters or the binary sequence of a word). Similar to this interpretation, Baker teaches a
word recognition system that breaks down a text unit into its components in order to
compare it with other text units of similar words by comparing the components.

It would have been obvious to one of ordinary skill in the art at the time of the 
invention to include Baker with Inaba to develop a system that breaks down words into 
smaller components and analyzes the components of the words (i.e., analyzing spelling,
binary code, ...) to disregard words that have the same spelling as the subject code in
the text unit pair, providing the benefit of detecting duplication and ambiguity of text 
words associated with subject codes. 
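
For illustration only, a literal reading of the quoted claim 5 language (a subject code shared by a pair of text units is disregarded when the same word spelling carries that code in both units) can be sketched as follows. This is a minimal sketch under stated assumptions and does not represent Baker, Liddy, Inaba, or the applicant's implementation: the function names, the `(spelling, subject_code)` input format, and the sample codes "FIN" and "GEO" are hypothetical.

```python
from collections import defaultdict


def codes_to_spellings(unit):
    """Map each subject code in a text unit to the word spellings carrying it."""
    mapping = defaultdict(set)
    for spelling, code in unit:
        mapping[code].add(spelling)
    return mapping


def count_shared_codes(unit_a, unit_b):
    """Count subject codes shared by two units, skipping same-spelling matches."""
    a = codes_to_spellings(unit_a)
    b = codes_to_spellings(unit_b)
    shared = 0
    for code in a.keys() & b.keys():
        # Assumption: any common spelling for this code causes it to be disregarded.
        if a[code] & b[code]:
            continue
        shared += 1
    return shared


unit_a = [("bank", "FIN"), ("river", "GEO")]
unit_b = [("loan", "FIN"), ("river", "GEO")]
shared = count_shared_codes(unit_a, unit_b)
```

In this toy input, the code "FIN" is counted because it is reached through different spellings, while "GEO" is disregarded because the same spelling "river" carries it in both units.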

In regard to dependent claim 6, Inaba fails to teach a system that does not
carry out disregarding of subject codes when the codes relate to only a single word
spelling in the word text. Baker teaches an ambiguity recognition system which
recognizes ambiguous words that occur within a passage of words (Baker, Abstract;
compare with claim language "disregarding occurrences... single word spelling in the
word text"). The examiner interprets the goal of this claim as developing a method to
reduce ambiguity when the system interprets word meanings and to not remove words
that are not ambiguous. Similar to this interpretation, Baker teaches a word recognition
system that reduces ambiguities amongst words in a passage of words. It would have
been obvious to one of ordinary skill in the art at the time of the invention to include
Baker's teaching with Inaba to develop a system that does not disregard occurrences of
subject codes which relate to only a single word spelling in the word text, providing the
benefit of having a data store of text units with only a single word spelling in the word
text, minimizing ambiguity.



Conclusion

Any inquiry concerning this communication or earlier communications from the
examiner should be directed to Gautam Sain whose telephone number is 703-305-8777.
The examiner can normally be reached on M-F 9-5 EST.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's
supervisor, Joseph Feild, can be reached on (703) 305-9792. The fax phone number for
the organization where this application or proceeding is assigned is (703) 872-9306.

Any inquiry of a general nature or relating to the status of this application or
proceeding should be directed to the receptionist whose telephone number is (703) 305-3900.

PRIMARY EXAMINER